Improving Phrase-based Korean-Englis

Authors

  • Jonghoon Lee
  • Donghyeon Lee
  • Gary Geunbae Lee
Abstract

In this paper, we describe several techniques to improve Korean-English statistical machine translation. We built a phrase-based statistical machine translation system for the travel domain and applied several techniques to this baseline system to improve translation quality. Each technique can be applied or removed easily, since each is part of either the preprocessing method or the corpus processing method. Our experiments show that most of the techniques were successful, with the exception of reordering the word sequence. The combination of the successful techniques significantly improved translation quality.

Similar resources

Decoder-based Discriminative Training of Phrase Segmentation for Statistical Machine Translation

In this paper, we propose a new method of training a phrase segmentation model for phrase-based statistical machine translation (SMT). We define a good segmentation as one that produces a good translation. According to this definition, we propose a method that can discriminate between a good segmentation and a bad segmentation based on the translation quality. The proposed approach constru...

Probabilistic Language Model for Analyzing Korean Sentences

In this paper, we introduce a restricted form of phrase structure grammar to handle the characteristics of Korean more efficiently. Based on this restricted form of the grammar, we propose a probabilistic parser for Korean sentences. To show the usefulness of the proposed parser, we conducted a preliminary experiment. We extract a set of rules from about 1,682 tree-annotated sentences. The e...

Phrase database Approach to structural and semantic disambiguation in English-Korean Machine Translation

In machine translation, it is a common phenomenon that machine-readable dictionaries and standard parsing rules are not enough to ensure accuracy in parsing and translating English phrases into Korean, and the resulting structural and semantic ambiguities lead to misleading translations. This paper aims to suggest a solution to structural and semantic ambiguities due t...

Non-adjacent segmental effects in tonal realization of accentual phrase in Seoul Korean

This paper investigates the degree to which an onset consonant of an accentual phrase affects the f0 of the following syllables within the phrase in Seoul Korean. Korean tense or aspirated onset consonants raise f0 values of the following adjacent vowel, and when they are positioned on the first syllable onset of an accentual phrase, they continuously raise f0 values of the following non-adjace...

Korean Phrase Structure Grammar and Its Implementations into the LKB System

Though various morphological analysers have been developed for Korean, no serious attempts have been made to build a syntactic or semantic parser for it, partly because of its structural complexity and partly because of the lack of a reliable grammar-building system. This paper presents a result of our ongoing project to build a computationally feasible Korean Phrase Structure Grammar...

Publication date: 2006